Unified Data Management and Comprehensive Performance Evaluation for Urban Spatial-Temporal Prediction [Experiment, Analysis & Benchmark]
The field of urban spatial-temporal prediction is advancing rapidly with the
development of deep learning techniques and the availability of large-scale
datasets. However, challenges persist in accessing and utilizing diverse urban
spatial-temporal datasets from different sources and stored in different
formats, as well as determining effective model structures and components with
the proliferation of deep learning models. This work addresses these challenges
and provides three significant contributions. Firstly, we introduce "atomic
files", a unified storage format designed for urban spatial-temporal big data,
and validate its effectiveness on 40 diverse datasets, simplifying data
management. Secondly, we present a comprehensive overview of technological
advances in urban spatial-temporal prediction models, guiding the development
of robust models. Thirdly, we conduct extensive experiments using diverse
models and datasets, establishing a performance leaderboard and identifying
promising research directions. Overall, this work effectively manages urban
spatial-temporal data, guides future efforts, and facilitates the development
of accurate and efficient urban spatial-temporal prediction models. It can
potentially make long-term contributions to urban spatial-temporal data
management and prediction, ultimately leading to improved urban living
standards.
Comment: 14 pages, 3 figures. arXiv admin note: text overlap with
arXiv:2304.1434
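The abstract does not specify the atomic-file schema, so the following is an illustration only: it assumes a CSV-style table keyed by time and spatial entity (all column and field names here are hypothetical) and shows how such a unified file could be parsed into a dense time × entity × feature tensor for a prediction model.

```python
import csv
import io

# Hypothetical example of one atomic file: one row per
# (timestamp, spatial entity) pair, observed features as columns.
DYNA = """\
time,entity_id,inflow,outflow
2023-01-01T00:00,0,12,7
2023-01-01T00:00,1,3,9
2023-01-01T01:00,0,15,6
2023-01-01T01:00,1,4,11
"""

def load_dyna(text):
    """Parse an atomic file into a dense [time][entity][feature] tensor."""
    rows = list(csv.DictReader(io.StringIO(text)))
    times = sorted({r["time"] for r in rows})
    entities = sorted({int(r["entity_id"]) for r in rows})
    feats = [c for c in rows[0] if c not in ("time", "entity_id")]
    tensor = [[[0.0] * len(feats) for _ in entities] for _ in times]
    t_idx = {t: i for i, t in enumerate(times)}
    for r in rows:
        cell = tensor[t_idx[r["time"]]][int(r["entity_id"])]
        for j, f in enumerate(feats):
            cell[j] = float(r[f])
    return tensor, times, entities, feats

tensor, times, entities, feats = load_dyna(DYNA)
print(len(times), len(entities), feats)  # 2 2 ['inflow', 'outflow']
```

The point of such a format is that every dataset, whatever its source, reduces to the same tabular contract, so one loader serves all 40 datasets.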
EulerNet: Adaptive Feature Interaction Learning via Euler's Formula for CTR Prediction
Learning effective high-order feature interactions is crucial in the CTR
prediction task. However, it is very time-consuming to calculate high-order
feature interactions with massive features in online e-commerce platforms. Most
existing methods manually design a maximal order and further filter out the
useless interactions from them. Although they reduce the high computational
costs caused by the exponential growth of high-order feature combinations, they
still suffer from the degradation of model capability due to the suboptimal
learning of the restricted feature orders. Maintaining model capability while
keeping computation efficient is a technical challenge that has not been
adequately addressed. To address this issue, we propose an adaptive
feature interaction learning model, named EulerNet, in which the feature
interactions are learned in a complex vector space by conducting space mapping
according to Euler's formula. EulerNet converts the exponential powers of
feature interactions into simple linear combinations of the modulus and phase
of the complex features, making it possible to adaptively learn the high-order
feature interactions in an efficient way. Furthermore, EulerNet incorporates
the implicit and explicit feature interactions into a unified architecture,
which achieves the mutual enhancement and largely boosts the model
capabilities. Such a network can be fully learned from data, with no need for a
pre-designed form or order of feature interactions. Extensive experiments
conducted on three public datasets have demonstrated the effectiveness and
efficiency of our approach. Our code is available at:
https://github.com/RUCAIBox/EulerNet.
Comment: 10 pages, 7 figures, accepted for publication in SIGIR'2
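The modulus-phase trick the abstract describes follows directly from Euler's formula: writing each complex feature as x_j = r_j·e^{iθ_j}, a product of powers ∏_j x_j^{w_j} equals exp(Σ_j w_j·ln r_j) · e^{i·Σ_j w_j·θ_j}, i.e., a linear combination of log-moduli and phases. A minimal numerical check of this identity (all moduli, phases, and interaction orders below are made up):

```python
import cmath
import math

# Hypothetical 2-feature example: each feature is a complex value
# x_j = r_j * e^{i*theta_j}; w_j is a real (possibly fractional) order.
features = [(2.0, 0.5), (1.5, 1.2)]   # (modulus r, phase theta), assumed
weights = [0.7, 1.3]                  # adaptive interaction orders, assumed

# Direct computation: product of exponential powers x_j ** w_j.
direct = 1 + 0j
for (r, theta), w in zip(features, weights):
    direct *= cmath.exp(w * (math.log(r) + 1j * theta))

# Euler's-formula view: linear combinations of log-modulus and phase.
log_mod = sum(w * math.log(r) for (r, _), w in zip(features, weights))
phase = sum(w * theta for (_, theta), w in zip(features, weights))
linear = math.exp(log_mod) * cmath.exp(1j * phase)

print(abs(direct - linear) < 1e-12)  # True: the two forms agree
```

This is why the exponential cost of enumerating high-order combinations can be replaced by learnable linear mixing of moduli and phases.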
Dense Text Retrieval based on Pretrained Language Models: A Survey
Text retrieval is a long-standing research topic in information seeking,
where a system is required to return relevant information resources in
response to users' natural-language queries. From classic retrieval methods to
learning-based ranking functions, the underlying retrieval models have
continually evolved with ongoing technical innovation. To design effective
retrieval models, a key point lies in how to learn the text representation and
model the relevance matching. The recent success of pretrained language models
(PLMs) sheds light on developing more capable text retrieval approaches by
leveraging the excellent modeling capacity of PLMs. With powerful PLMs, we can
effectively learn the representations of queries and texts in the latent
representation space, and further construct the semantic matching function
between the dense vectors for relevance modeling. Such a retrieval approach is
referred to as dense retrieval, since it employs dense vectors (a.k.a.,
embeddings) to represent the texts. Considering the rapid progress on dense
retrieval, in this survey, we systematically review the recent advances on
PLM-based dense retrieval. Different from previous surveys on dense retrieval,
we take a new perspective to organize the related work by four major aspects,
including architecture, training, indexing and integration, and summarize the
mainstream techniques for each aspect. We thoroughly survey the literature, and
include 300+ related reference papers on dense retrieval. To support our
survey, we create a website providing useful resources, and release a code
repository and toolkit for implementing dense retrieval models. This survey aims
to provide a comprehensive, practical reference focused on the major progress
for dense text retrieval.
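As a toy illustration of the dense-retrieval scoring the survey covers: once a PLM-based bi-encoder has mapped queries and documents into the same latent space, relevance matching reduces to an inner product between dense vectors. The embeddings below are made up stand-ins for encoder outputs:

```python
# Toy stand-in for PLM encoder outputs: in practice a bi-encoder maps
# each text to a dense vector; here the embeddings are simply assumed.
doc_embeddings = {
    "d1": [0.9, 0.1, 0.0],
    "d2": [0.2, 0.8, 0.1],
    "d3": [0.1, 0.1, 0.9],
}
query_embedding = [0.8, 0.2, 0.1]

def dot(u, v):
    """Inner product between two dense vectors."""
    return sum(a * b for a, b in zip(u, v))

def rank(query, docs):
    """Score every document against the query and sort descending."""
    return sorted(docs, key=lambda d: dot(query, docs[d]), reverse=True)

print(rank(query_embedding, doc_embeddings))  # ['d1', 'd2', 'd3']
```

At scale, the sorted scan is replaced by an approximate nearest-neighbor index, which is exactly the "indexing" aspect the survey organizes.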
MVP: Multi-task Supervised Pre-training for Natural Language Generation
Pre-trained language models (PLMs) have achieved remarkable success in
natural language generation (NLG) tasks. Up to now, most NLG-oriented PLMs are
pre-trained in an unsupervised manner on a large-scale general corpus.
Meanwhile, an increasing number of models pre-trained with labeled data
(i.e., "supervised pre-training") showcase superior performance compared to
unsupervised pre-trained models. Motivated by the success of supervised
pre-training, we propose Multi-task superVised Pre-training (MVP) for natural
language generation. We collect a large-scale natural language generation
corpus, MVPCorpus, built from many datasets spanning diverse NLG tasks. Then we
unify these examples into a general text-to-text format to pre-train the text
generation model MVP in a supervised manner. For each task, we further
pre-train specific soft prompts to stimulate the model's capacity to perform a
specific task. Our MVP model can be seen as applying the recent idea of
instruction tuning to relatively small PLMs. Extensive experiments have
demonstrated the effectiveness and generality of our MVP model across a range
of NLG tasks, achieving state-of-the-art performance on many of the evaluated
datasets and outperforming both BART and Flan-T5.
Comment: Accepted by ACL 202
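The abstract does not give MVPCorpus's exact serialization; the sketch below shows one hypothetical way heterogeneous NLG examples can be unified into a general text-to-text format (all task names and field names here are illustrative, not the paper's):

```python
# Hypothetical unification of different NLG tasks into plain
# (input text -> output text) pairs, one format for all tasks.
def to_text2text(task, example):
    if task == "summarization":
        return f"Summarize: {example['document']}", example["summary"]
    if task == "data-to-text":
        triples = " | ".join(" ; ".join(t) for t in example["triples"])
        return f"Describe: {triples}", example["text"]
    raise ValueError(f"unknown task: {task}")

src, tgt = to_text2text(
    "data-to-text",
    {"triples": [("Paris", "capital_of", "France")],
     "text": "Paris is the capital of France."},
)
print(src)  # Describe: Paris ; capital_of ; France
```

Once every task is flattened this way, a single sequence-to-sequence model can be pre-trained on all of them jointly, with per-task soft prompts layered on top.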
Alleviating the Long-Tail Problem in Conversational Recommender Systems
Conversational recommender systems (CRS) aim to provide the recommendation
service via natural language conversations. To develop an effective CRS,
high-quality CRS datasets are crucial. However, existing CRS datasets
suffer from the long-tail issue, i.e., a large proportion of items are rarely
(or even never) mentioned in the conversations; these are called long-tail
items. As a result, CRSs trained on these datasets tend to recommend frequent
items, which largely reduces the diversity of the recommended items and makes
users more likely to get bored.
To address this issue, this paper presents LOT-CRS, a novel
framework that focuses on simulating and utilizing a balanced CRS dataset
(i.e., covering all the items evenly) to improve the LOng-Tail
recommendation performance of CRSs. In our approach, we design two pre-training
tasks to enhance the understanding of simulated conversation for long-tail
items, and adopt retrieval-augmented fine-tuning with a label-smoothness
strategy
to further improve the recommendation of long-tail items. Extensive experiments
on two public CRS datasets have demonstrated the effectiveness and
extensibility of our approach, especially on long-tail recommendation.
Comment: work in progress
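The paper's exact label-smoothness strategy is not detailed in the abstract; as a generic illustration, standard label smoothing mixes the one-hot target for the ground-truth item with a uniform distribution over the item catalog, so long-tail items keep nonzero target probability mass:

```python
import math

# Generic label-smoothing sketch (the paper's exact strategy may
# differ): blend the one-hot target with a uniform distribution.
def smoothed_targets(num_items, gold_item, epsilon=0.1):
    uniform = epsilon / num_items
    targets = [uniform] * num_items
    targets[gold_item] += 1.0 - epsilon
    return targets

def cross_entropy(log_probs, targets):
    """Soft-target cross-entropy over the item catalog."""
    return -sum(t * lp for t, lp in zip(targets, log_probs))

targets = smoothed_targets(num_items=4, gold_item=2)
log_probs = [math.log(0.25)] * 4          # a uniform model prediction
loss = cross_entropy(log_probs, targets)
print(round(sum(targets), 6), round(loss, 3))  # 1.0 1.386
```

Spreading a little target mass over rarely mentioned items discourages the model from collapsing onto the frequent head of the distribution.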